DISCO: Describing Images Using Scene Contexts and Objects

Authors

  • Ifeoma Nwogu
  • Yingbo Zhou
  • Christopher Brown
Abstract

In this paper, we propose a bottom-up approach to generating short descriptive sentences from images, to enhance scene understanding. We demonstrate automatic methods for mapping the visual content in an image to natural spoken or written language. We also introduce a human-in-the-loop evaluation strategy that quantitatively captures the meaningfulness of the generated sentences. We recorded a correctness rate of 60.34% when human users were asked to judge the meaningfulness of the sentences generated from relatively challenging images. Also, our automatic methods compared well with the state-of-the-art techniques for the related computer vision tasks.

Similar resources

Color scene transform between images using Rosenfeld-Kak histogram matching method

In digital color imaging, it is of interest to transform the color scene of one image to that of another. Some attempts have been made at this using, for example, the lαβ color space, principal component analysis and, recently, the histogram rescaling method. In this research, a novel method is proposed based on the Rosenfeld and Kak histogram matching algorithm. It is suggested that to transform the color...

Heightened Responses of the Parahippocampal and Retrosplenial Cortices during Contextualized Recognition of Congruent Objects

Context sometimes helps make objects more recognizable. Previous studies using functional magnetic resonance imaging (fMRI) have examined regional neural activity when objects have strong or weak associations with their contexts. Such studies have demonstrated that activity in the parahippocampal cortex (PHC) generally corresponds with strong associations between objects and their spatial conte...

DisCo: Displays that Communicate

We present DisCo, a novel display-camera communication system. DisCo enables displays and cameras to communicate with each other, while also displaying and capturing images for human consumption. Messages are transmitted by temporally modulating the display brightness at high frequencies so that they are imperceptible to humans. Messages are received by a rolling shutter camera which converts t...

Generating Image Captions using Topic Focused Multi-document Summarization

In the near future digital cameras will come standardly equipped with GPS and compass and will automatically add global position and direction information to the metadata of every picture taken. Can we use this information, together with information from geographical information systems and the Web more generally, to caption images automatically? This challenge is being pursued in the TRIPOD pr...

An Evaluation on Color Invariant Based Local Spatiotemporal Features for Action Recognition

Despite recent advances in the design of features to improve automated human action recognition, color information has so far been overlooked. Nevertheless, color has been proven an important element to the success of automated recognition of objects/scenes and segmentation. For object and scene recognition in static images, robustness to photometric variations has been achieved by describing l...

Publication date: 2011